Enhanced speech coding based on phonetic class segmentation

نویسندگان

  • Adriane Swalm Durey
  • Venkatesh Krishnan
  • Thomas P. Barnwell
چکیده

Given a baseline speech coder and speech with an available phonetic class segmentation, a number of potential enhancements to that coder become possible. While the quality of speech segmentation by phoneme and phonetic class is constantly improving, we use TIMIT to generate phonetic class segmentation as a basis for initial testing of these techniques. Using coders drawn from the MELP family, we explore specialized phonetic codebooks, phonetically-driven superframing, and improved modeling of specific phonetic classes and the transitions between them. We compare the reconstructed speech from these enhancements against the base coder using the metrics of computational cost, transmission cost, and the quality of the reconstructed speech. In most cases, we find that segmentation-based coders can produce speech with quality comparable to that of MELP, using fewer transmitted bits and at no additional computational cost. With phonetic codebooks and transition modeling, CCR tests show these segmentation-based coders produce speech of better quality than is produced by MELP.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Automatic pitch-synchronous phonetic segmentation

This paper deals with an HMM-based automatic phonetic segmentation (APS) system and proposes to increase its performance by employing a pitch-synchronous (PS) coding scheme. Such a coding scheme uses different frames of speech throughout voiced and unvoiced speech regions and enables thus better modelling of each individual phone. The PS coding scheme is shown to outperform the traditionally ut...

متن کامل

Qualitative Evaluation and Error Analysis of Phonetic Segmentation

Speech segmentation is the process of splitting and identifying the boundaries between different units of speech, i.e., words, syllables, and phones. This paper focuses on the automatic phonetic segmentation of speech and the methods used for its evaluation. We explain the current methods used for the evaluation of speech segmentation and highlight the details that have not been sufficiently ad...

متن کامل

Improving the robustness of phonetic segmentation to accent and style variation with a two-staged approach

Correct and temporally accurate phonetic segmentation of speech utterances is important in applications ranging from transcription alignment to pronunciation error detection. Automatic speech recognizers used in these tasks provide insufficient temporal alignment accuracy apart from a recognition performance that is sensitive to accent and style variations from the training data. A two-staged a...

متن کامل

Automatic Phonetic Segmentation for a Speech Corpus of Hebrew

This paper presents our study on different phonetic segmentation methods based on hidden Markov models evaluated against a Hebrew speech corpus. We investigated methods for fully automatic phonetic segmentation using only the corpus which should be segmented and automatically generated phonetic transcriptions. A new method for phonetic boundary correction based on spectral variation of the spee...

متن کامل

Improvements on Automatic Speech Segmentation at the Phonetic Level

In this paper, we present some recent improvements in our automatic speech segmentation system, which only needs the speech signal and the phonetic sequence of each sentence of a corpus to be trained. It estimates a GMM by using all the sentences of the training subcorpus, where each Gaussian distribution represents an acoustic class, which probability densities are combined with a set of condi...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2005